35 research outputs found

    Analysis of the proper nano environment for nuclearing and maintaining the secondary structure elements in the structural context for functional proteins

    Advisor: Goran Neshich. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Biologia. Degree: Doctorate (Bioinformatics); Doctor in Genetics and Molecular Biology.
    Abstract: Proteins play a vital role in maintaining life. Their functions include structural support, transport, protection and defense, control and regulation of expression, catalysis, movement, and storage. To better understand the relationship between a protein's amino acid sequence, its three-dimensional structure, and the function it performs, we proposed an analysis of the protein nano-environment in which the secondary structure elements alpha-helix, beta-sheet, and turn are inserted. The hypothesis that motivated this approach is the existence of a "signal", that is, a variation in the values of the physical-chemical and structural descriptors that distinguishes the specific site where a given secondary structure element is inserted in the framework of the entire protein. Understanding how secondary structure elements are formed will open the way to understanding how proteins assume their final structure and, consequently, their function. In this work we used STING_RDB, a unique database that brings together in a single repository more than 1500 physical-chemical and structural descriptors of all amino acid residues for each chain of all protein structures deposited in the PDB (Protein Data Bank). The structures stored in STING_RDB were separated into different Data Marts, which are portions extracted from the entire database after strict selection.
    The structures selected and stored in these Data Marts were then aligned positionally by the respective secondary structure element, and the data for the physical-chemical and structural descriptors that describe the nano-environment where the secondary structure element is inserted were extracted from these alignments. This process was used in the search for the "signals". This work describes how the data contained in these Data Marts were selected, prepared, analyzed, and interpreted. Based on the results obtained, we conclude that the nano-environment can be described not by a single descriptor but by a set of descriptors, and that this description varies according to the secondary structure element studied. The description distinguishes the nano-environment from the rest of the protein, and not only one type of secondary structure element from another.

    Analysis of binding properties and specificity through identification of the interface forming residues (IFR) for serine proteases in silico docked to different inhibitors

    Background: Enzymes belonging to the same protein superfamily generally operate on a variety of substrates and are inhibited by a wide selection of inhibitors. Our main objective in this work was to expand the scope of studies that consider only the catalytic and binding pocket amino acids when analyzing enzyme specificity and to include instead a wider category, which we have named the Interface Forming Residues (IFR). We were motivated to identify those amino acids whose accessibility to solvent decreases after docking different types of inhibitors to subclasses of serine proteases, and then to create a table (matrix) of all amino acid positions at the interface together with their respective occupancies. Our goal is to establish a platform for analyzing the relationship between IFR characteristics and binding properties/specificity for bi-molecular complexes.
    Results: We propose a novel method for describing binding properties and delineating serine protease specificity by compiling an exhaustive table of interface forming residues (IFR) for serine proteases and their inhibitors. Currently, the Protein Data Bank (PDB) does not contain all the data that our analysis would require. Therefore, an in silico approach was designed for building the corresponding complexes. The IFRs are obtained by "rigid body docking" among 70 structurally aligned, sequence-wise non-redundant serine protease structures with 3 inhibitors: bovine pancreatic trypsin inhibitor (BPTI), ecotin, and ovomucoid third domain inhibitor. The table (matrix) of all amino acid positions at the interface and their respective occupancies is then created. We also developed a new computational protocol for predicting IFRs for complexes that have not yet been determined experimentally, achieving accuracy of at least 0.97.
    Conclusions: Serine protease interfaces prefer polar residues (including glycine), with some exceptions. Charged residues were found to be uniquely prevalent at the interfaces between the "miscellaneous-virus" subfamily and the three inhibitors. This prompts speculation about how important this difference in IFR characteristics is for maintaining the virulence of those organisms. Our work provides a unique tool for structure/function relationship analysis as well as a compilation of indicators detailing how the specificity of various serine proteases may have been achieved and/or could be altered. It also indicates that the interface forming residues, which also determine the specificity of a serine protease subfamily, cannot be presented in a canonical way but rather as a matrix of alternative populations of amino acids occupying a variety of IFR positions.
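The IFR idea described above (residues whose solvent accessibility decreases on docking, collected into a position-by-residue occupancy matrix) can be illustrated with a minimal sketch. This is not the authors' protocol: the function names, the `min_loss` threshold, the aligned position numbers, and the accessibility values below are all invented for illustration.

```python
from collections import Counter, defaultdict


def interface_forming_residues(free_area, complex_area, min_loss=0.0):
    """Return positions whose accessible area drops by more than min_loss.

    free_area / complex_area map an aligned position to the residue's
    solvent-accessible area before and after docking (hypothetical inputs).
    """
    return {
        pos
        for pos, area in free_area.items()
        if area - complex_area.get(pos, area) > min_loss
    }


def occupancy_matrix(complexes):
    """Map each interface position to a Counter of residue types seen there."""
    matrix = defaultdict(Counter)
    for residues, free_area, complex_area in complexes:
        for pos in interface_forming_residues(free_area, complex_area):
            matrix[pos][residues[pos]] += 1
    return dict(matrix)


# Two invented protease/inhibitor complexes in a shared (aligned) numbering.
EXAMPLE = [
    ({57: "H", 189: "D", 195: "S"},
     {57: 40.0, 189: 25.0, 195: 30.0},   # area in the free enzyme
     {57: 5.0, 189: 25.0, 195: 2.0}),    # area in the docked complex
    ({57: "H", 189: "S", 195: "S"},
     {57: 38.0, 189: 20.0, 195: 28.0},
     {57: 10.0, 189: 4.0, 195: 1.0}),
]

matrix = occupancy_matrix(EXAMPLE)
for position in sorted(matrix):
    print(position, dict(matrix[position]))
```

Keeping a `Counter` per position, rather than a single consensus residue, mirrors the abstract's point that the interface is better presented as a matrix of alternative residue populations than in a canonical form.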

    The complete genome sequence of Chromobacterium violaceum reveals remarkable and exploitable bacterial adaptability

    Chromobacterium violaceum is one of millions of species of free-living microorganisms that populate the soil and water in the extant areas of tropical biodiversity around the world. Its complete genome sequence reveals (i) extensive alternative pathways for energy generation, (ii) ≈500 ORFs for transport-related proteins, (iii) complex and extensive systems for stress adaptation and motility, and (iv) wide-spread utilization of quorum sensing for control of inducible systems, all of which underpin the versatility and adaptability of the organism. The genome also contains extensive but incomplete arrays of ORFs coding for proteins associated with mammalian pathogenicity, possibly involved in the occasional but often fatal cases of human C. violaceum infection. There is, in addition, a series of previously unknown but important enzymes and secondary metabolites including paraquat-inducible proteins, drug and heavy-metal-resistance proteins, multiple chitinases, and proteins for the detoxification of xenobiotics that may have biotechnological applications

    Exclusive α-helices in all-α proteins

    Dataset containing physical-chemical and structural descriptors for exclusive α-helices (only a single helical segment is present in the protein's whole structure) and their flanking regions. Only α-helices from all-α proteins are eligible for this dataset.
    Data is organized into sections. Each section begins with the following header: length, pdb_name, chain_id, flag, position, [parameter].
    length: the number of amino acid residues comprising the α-helix.
    flag: the region in which the amino acid residue lies, namely: flanking region before the α-helix (<); α-helix (=); flanking region after the α-helix (>).
    [parameter]: the parameter's value for the given amino acid residue. All parameters are extracted from the STING Database.
    Example: solvent-accessible surface area for all amino acid residues forming an α-helix of length 6
    length, pdb_name, chain_id, flag, position, Accessible_Surface_in_Isolation
    6,1not,A,=,1,136.585
    6,1not,A,=,2,106.068
    6,1not,A,=,3,11.322
    6,1not,A,=,4,49.517
    6,1not,A,=,5,227.67
    6,1not,A,=,6,110.9
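A file in the sectioned layout described above could be read along the following lines. This is a minimal sketch that assumes comma-separated rows and that every section starts with the header shown; the function name `parse_sections` and the dictionary keys are hypothetical, and the sample text is the example from the dataset description.

```python
import csv
import io
from collections import defaultdict

# The example section quoted in the dataset description.
SAMPLE = """\
length, pdb_name, chain_id, flag, position, Accessible_Surface_in_Isolation
6,1not,A,=,1,136.585
6,1not,A,=,2,106.068
6,1not,A,=,3,11.322
6,1not,A,=,4,49.517
6,1not,A,=,5,227.67
6,1not,A,=,6,110.9
"""

FIXED_COLUMNS = ["length", "pdb_name", "chain_id", "flag", "position"]


def parse_sections(text):
    """Return {parameter_name: [row, ...]} for every section in text."""
    sections = defaultdict(list)
    parameter = None
    # skipinitialspace drops the blank that follows each comma in the header.
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not fields:
            continue  # blank line between sections
        if fields[:5] == FIXED_COLUMNS:
            parameter = fields[5]  # a header row opens a new section
            continue
        if parameter is None:
            raise ValueError("data row found before any section header")
        length, pdb_name, chain_id, flag, position, value = fields
        sections[parameter].append({
            "length": int(length),
            "pdb_name": pdb_name,
            "chain_id": chain_id,
            "flag": flag,  # "<" before, "=" inside, ">" after the helix
            "position": int(position),
            "value": float(value),
        })
    return dict(sections)


sections = parse_sections(SAMPLE)
helix_rows = sections["Accessible_Surface_in_Isolation"]
print(len(helix_rows), "residues, first value", helix_rows[0]["value"])
```

Grouping rows by parameter name keeps each descriptor separate, so rows with the same flag and position can later be compared across helices of equal length.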

    α-helices in all-α proteins

    Dataset containing physical-chemical and structural descriptors for α-helices and their flanking regions. Only α-helices from all-α proteins are eligible for this dataset.
    Data is organized into sections. Each section begins with the following header: length, pdb_name, chain_id, flag, position, [parameter].
    length: the number of amino acid residues comprising the α-helix.
    flag: the region in which the amino acid residue lies, namely: flanking region before the α-helix (<); α-helix (=); flanking region after the α-helix (>).
    [parameter]: the parameter's value for the given amino acid residue. All parameters are extracted from the STING Database.

    Exclusive α-helices in (α+β)+(α/β) proteins

    Dataset containing physical-chemical and structural descriptors for exclusive α-helices and their flanking regions. Only α-helices from (α+β)+(α/β) proteins are eligible for this dataset.
    Data is organized into sections. Each section begins with the following header: length, pdb_name, chain_id, flag, position, [parameter].
    length: the number of amino acid residues comprising the α-helix.
    flag: the region in which the amino acid residue lies, namely: flanking region before the α-helix (<); α-helix (=); flanking region after the α-helix (>).
    [parameter]: the parameter's value for the given amino acid residue. All parameters are extracted from the STING Database.

    α-helices in (α+β)+(α/β) proteins

    Dataset containing physical-chemical and structural descriptors for α-helices and their flanking regions. Only α-helices from (α+β)+(α/β) proteins are eligible for this dataset.
    Data is organized into sections. Each section begins with the following header: length, pdb_name, chain_id, flag, position, [parameter].
    length: the number of amino acid residues comprising the α-helix.
    flag: the region in which the amino acid residue lies, namely: flanking region before the α-helix (<); α-helix (=); flanking region after the α-helix (>).
    [parameter]: the parameter's value for the given amino acid residue. All parameters are extracted from the STING Database.

    Non-exclusive α-helices in all-α proteins

    Dataset containing physical-chemical and structural descriptors for non-exclusive α-helices and their flanking regions. Only α-helices from all-α proteins are eligible for this dataset.
    Data is organized into sections. Each section begins with the following header: length, pdb_name, chain_id, flag, position, [parameter].
    length: the number of amino acid residues comprising the α-helix.
    flag: the region in which the amino acid residue lies, namely: flanking region before the α-helix (<); α-helix (=); flanking region after the α-helix (>).
    [parameter]: the parameter's value for the given amino acid residue. All parameters are extracted from the STING Database.

    Study of specific nanoenvironments containing α-helices in all-α and (α+β)+(α/β) proteins

    Protein secondary structure elements (PSSEs) such as α-helices, β-strands, and turns are the primary building blocks of the tertiary protein structure. Our primary interest here is to reveal the characteristics of the nanoenvironment formed by both PSSEs and their surrounding amino acid residues (AARs), which might contribute to the general understanding of how proteins fold. The characteristics of such nanoenvironments must be specific to each secondary structure element, and we have set our goal here to gather the fullest possible description of the α-helical nanoenvironment. In general, this postulate (the existence of specific nanoenvironments for specific protein substructures/neighbourhoods/regions with distinct functionality) was already successfully explored and confirmed for some protein regions, such as protein-protein interfaces and enzyme catalytic sites. Consequently, PSSEs were the obvious next choice for further work seeking evidence that specific nanoenvironments (having characteristics fully describable by means of structural and physicochemical descriptors) do exist for the corresponding intraprotein regions. The nanoenvironment of α-helices (nEoαH) is defined as any region of the protein where this secondary structure element type is detected. The nEoαH, therefore, includes not only the α-helix amino acid residues but also the residues immediately around the α-helix. The hypothesis that motivated this work is that it might in fact be possible to detect a postulated “signal” or “signature” that distinguishes the specific location of α-helices. This “signal” must be discernible by tracking how the values of physical, chemical, physicochemical, structural and geometric descriptors immediately before (or after) the PSSE differ from those in the region along the α-helices. The search for this specific nanoenvironment “signal” was made possible by aligning previously selected α-helices of equal length.
Afterward, we calculated the average value, standard deviation and mean square error at each aligned residue position for each selected descriptor. We applied Student’s t-test, the Kolmogorov-Smirnov test and MANOVA to the dataset constructed as described above, and the results confirmed that the hypothesized “signal”/“signature” both exists and is identifiable, and that it is capable of distinguishing the presence of an α-helix inside the specific nanoenvironment, contextualized as a specific region within the whole protein. However, such a conclusion can rarely be reached if only one descriptor is considered at a time. A more accurate signal with broader coverage is achieved only if one applies multivariate analysis, which means that several descriptors (usually approximately 10) should be considered at the same time. To a limited extent (up to a maximum of 15% of cases), such a conclusion is also possible with only a single descriptor, and it is possible in general for up to 50–80% of cases when no fewer than 5 nonlinear descriptors are selected and considered. Using all the descriptors considered in this work, provided all assumptions about data characteristics for this analysis are met, multivariate analysis regularly reached a coverage and accuracy above 90%. Understanding how secondary structure elements are formed and maintained within a protein structure could enable a more detailed understanding of how proteins reach their final 3D structure and, consequently, their function. Likewise, this knowledge may also improve the tools used to determine how good a structure is by comparing the “signal” around a selected PSSE with the one obtained from the best (resolution- and quality-wise) protein structures available.
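The per-position summary and the single-descriptor comparison described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the values are invented, the function names are hypothetical, only the mean and sample standard deviation are computed, and Welch's unequal-variance t statistic stands in for the t-test (the abstract does not say which variant was used). The Kolmogorov-Smirnov and MANOVA analyses are not reproduced.

```python
import math
import statistics
from collections import defaultdict

# Invented (flag, position, value) rows for one descriptor, taken from three
# aligned helices of equal length: "<" is the flank before, "=" the helix.
ROWS = [
    ("<", 1, 150.0), ("<", 1, 140.0), ("<", 1, 160.0),
    ("=", 1, 60.0), ("=", 1, 70.0), ("=", 1, 50.0),
    ("=", 2, 40.0), ("=", 2, 55.0), ("=", 2, 45.0),
]


def per_position_stats(rows):
    """Mean and sample standard deviation at each aligned (flag, position)."""
    grouped = defaultdict(list)
    for flag, position, value in rows:
        grouped[(flag, position)].append(value)
    return {
        key: (statistics.mean(values), statistics.stdev(values))
        for key, values in grouped.items()
    }


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


stats = per_position_stats(ROWS)
flank = [value for flag, _, value in ROWS if flag == "<"]
helix = [value for flag, _, value in ROWS if flag == "="]
t_statistic = welch_t(flank, helix)

for key in sorted(stats):
    print(key, stats[key])
print("Welch t (flank vs helix):", round(t_statistic, 2))
```

A large absolute t value for a descriptor would be one indication of a "signal" between the flank and the helix, although the abstract stresses that a single descriptor rarely suffices and that several descriptors should be considered together.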